I attended a session at AWS Summit Tokyo on 「AXA Life Insurance: Ideal Form of Data Utilization, a “Lakehouse” for Centralized Analysis and Forecasting」
Introduction
Hemanth of the Alliance Department here. In this blog, I'll be writing about the "Ideal data utilization situations and lakehouse for centralized analysis and forecasting" session [AP-31] at AWS Summit Tokyo 2023. As the session was in Japanese, I have done my best to present all of the content in English here.
Main Speaker
Yuka Inose, Data Solution, AXA Life Insurance Co., Ltd.
Agenda
1) Opening by Databricks Japan Co., Ltd.
2) Introduction of the Lakehouse - Why is a Lakehouse necessary now?
3) Conclusion
Opening
The Databricks Data Lakehouse Platform covers:
1) Data Warehouse
2) Data Engineering
3) Data Streaming
4) Data Science, ML
There are other useful components as well, such as Unity Catalog and Delta Lake. According to Databricks, the Lakehouse Platform is:
1) Simple - data warehouse and AI use cases are integrated in one platform
2) Open - built on open source, with no need to move data out of the customer's AWS environment
3) TCO reduction - processing costs are 91% lower than on traditional data platforms
Introduction of the Lakehouse - Why is a Lakehouse necessary now?
This part of the session was divided into 4 sections.
Trajectory of the Data Field So Far
Before 2018 [Start of the "Think well first" phase] - Birth of the first data lake, used only by data scientists
2019-2020 [Start of the "It's OK to fail, so act first" phase] - Birth of the second-generation data lake, focused on the value chain and data accumulation
2021-2022 - Data lake utilization: use of the accumulated data, and metadata management
2023-2024 - Lakehouse in demand, integrating DL/DWH/DM/MLOps. Investment in this field rises.
2025 onward - Lakehouse utilization: every employee becomes familiar with analysis and machine learning
Turning Point for Lakehouse Investment
Three initiatives brought about the change:
1) Data Lake community - for people who have not yet used the data lake or QuickSight. Open-door events gave them opportunities to see product and case study introductions.
2) Data Kiosk - for people who have already started using it. A mechanism that acts as the reception point for all data-related requests and support.
3) In-house sales activity - proposals to departments with interest and needs. The initial hurdle is lowered by requiring no budget, at the start only.
The Need for a Lakehouse
The challenges up to that point were as follows:
Complicated - a traditional reporting system and multiple independent data lakes existed side by side, so only highly skilled users could use them
Retrodiction - analysis looked at how the past was rather than what the future holds; the mindset needs to shift from "past + present" to "past + present + future"
Closed - an environment is needed that can withstand the growing needs of machine learning use cases in the future
Future Prospects
From the operations side, the focus is on the business, support and agency functions, and on how AI can be successfully implemented for use by personnel. From the analytics side, the focus is on insight into the future, delivered to the data team and the business departments. From both viewpoints, it comes down to analyzing diverse data and making use of AI, which leads directly to the lakehouse.
Conclusion
Three main points were conveyed during the session:
1) Creating a masterpiece of data utilization that can be experienced throughout the organization
2) Just creating a simple structure draws out the underlying strength of the data team
3) An AI utilization partner for each employee, which in turn is achieved by building a lakehouse